Human-AI Teams
Assessing the Real-World Utility of Explainable AI for Arousal Diagnostics: An Application-Grounded User Study
Kraft, Stefan, Theissler, Andreas, Wienhausen-Wilke, Vera, Kasneci, Gjergji, Lensch, Hendrik
Artificial intelligence (AI) systems increasingly match or surpass human experts in biomedical signal interpretation. However, their effective integration into clinical practice requires more than high predictive accuracy. Clinicians must discern \textit{when} and \textit{why} to trust algorithmic recommendations. This work presents an application-grounded user study with eight professional sleep medicine practitioners, who score nocturnal arousal events in polysomnographic data under three conditions: (i) manual scoring, (ii) black-box (BB) AI assistance, and (iii) transparent white-box (WB) AI assistance. Assistance is provided either from the \textit{start} of scoring or as a post-hoc quality-control (\textit{QC}) review. We systematically evaluate how the type and timing of assistance influence event-level performance, the clinically most relevant count-based performance, time requirements, and user experience. When evaluated against the clinical standard used to train the AI, both AI and human-AI teams significantly outperform unaided experts, with collaboration also reducing inter-rater variability. Notably, transparent AI assistance applied as a targeted QC step yields median event-level performance improvements of approximately 30\% over black-box assistance, and QC timing further enhances count-based outcomes. While WB and QC approaches increase the time required for scoring, start-time assistance is faster and preferred by most participants. Participants overwhelmingly favor transparency, with seven out of eight expressing willingness to adopt the system with minor or no modifications. In summary, strategically timed transparent AI assistance effectively balances accuracy and clinical efficiency, providing a promising pathway toward trustworthy AI integration and user acceptance in clinical workflows.
Unraveling Human-AI Teaming: A Review and Outlook
Lou, Bowen, Lu, Tian, Raghu, T. S., Zhang, Yingjie
Artificial Intelligence (AI) is advancing at an unprecedented pace, with clear potential to enhance decision-making and productivity. Yet, the collaborative decision-making process between humans and AI remains underdeveloped, often falling short of its transformative possibilities. This paper explores the evolution of AI agents from passive tools to active collaborators in human-AI teams, emphasizing their ability to learn, adapt, and operate autonomously in complex environments. This paradigm shift challenges traditional team dynamics, requiring new interaction protocols, delegation strategies, and responsibility distribution frameworks. Drawing on Team Situation Awareness (SA) theory, we identify two critical gaps in current human-AI teaming research: the difficulty of aligning AI agents with human values and objectives, and the underutilization of AI's capabilities as genuine team members. Addressing these gaps, we propose a structured research outlook centered on four key aspects of human-AI teaming: formulation, coordination, maintenance, and training. Our framework highlights the importance of shared mental models, trust-building, conflict resolution, and skill adaptation for effective teaming. Furthermore, we discuss the unique challenges posed by varying team compositions, goals, and complexities. This paper provides a foundational agenda for future research and practical design of sustainable, high-performing human-AI teams.
Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance
To uncover how AI agents change productivity, performance, and work processes, we introduce MindMeld: an experimentation platform enabling humans and AI agents to collaborate in integrative workspaces. In a large-scale marketing experiment on the platform, 2310 participants were randomly assigned to human-human and human-AI teams, with randomized AI personality traits. The teams exchanged 183,691 messages, and created 63,656 image edits, 1,960,095 ad copy edits, and 10,375 AI-generated images while producing 11,138 ads for a large think tank. Analysis of fine-grained communication, collaboration, and workflow logs revealed that collaborating with AI agents increased communication by 137% and allowed humans to focus 23% more on text and image content generation messaging and 20% less on direct text editing. Humans on Human-AI teams sent 23% fewer social messages, creating 60% greater productivity per worker and higher-quality ad copy. In contrast, human-human teams produced higher-quality images, suggesting that AI agents require fine-tuning for multimodal workflows. AI personality prompt randomization revealed that AI traits can complement human personalities to enhance collaboration. For example, conscientious humans paired with open AI agents improved image quality, while extroverted humans paired with conscientious AI agents reduced the quality of text, images, and clicks. In field tests of ad campaigns with ~5M impressions, ads with higher image quality produced by human collaborations and higher text quality produced by AI collaborations performed significantly better on click-through rate and cost per click metrics. Overall, ads created by human-AI teams performed similarly to those created by human-human teams. Together, these results suggest AI agents can improve teamwork and productivity, especially when tuned to complement human traits.
Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi
Deep reinforcement learning has generated superhuman AI in competitive games such as Go and StarCraft. Can similar learning techniques create a superior AI teammate for human-machine collaborative games? Will humans prefer AI teammates that improve objective team performance or those that improve subjective metrics of trust? In this study, we perform a single-blind evaluation of teams of humans and AI agents in the cooperative card game Hanabi, with both rule-based and learning-based agents. In addition to the game score, used as an objective metric of the human-AI team performance, we also quantify subjective measures of the human's perceived performance, teamwork, interpretability, trust, and overall preference of AI teammate.
Shifting the Human-AI Relationship: Toward a Dynamic Relational Learning-Partner Model
As artificial intelligence (AI) continues to evolve, the current paradigm of treating AI as a passive tool no longer suffices. As a human-AI team, we together advocate for a shift toward viewing AI as a learning partner, akin to a student who learns from interactions with humans. Drawing from interdisciplinary concepts such as ecorithms, order from chaos, and cooperation, we explore how AI can evolve and adapt in unpredictable environments. Arising from these brief explorations, we present two key recommendations: (1) foster ethical, cooperative treatment of AI to benefit both humans and AI, and (2) leverage the inherent heterogeneity between human and AI minds to create a synergistic hybrid intelligence. By reframing AI as a dynamic partner, a model emerges in which AI systems develop alongside humans, learning from human interactions and feedback loops including reflections on team conversations. Drawing from a transpersonal and interdependent approach to consciousness, we suggest that a "third mind" emerges through collaborative human-AI relationships. Through design interventions such as interactive learning and conversational debriefing, and foundational interventions allowing AI to model multiple types of minds, we hope to provide a path toward more adaptive, ethical, and emotionally healthy human-AI relationships. We believe this dynamic relational learning-partner (DRLP) model for human-AI teaming, if enacted carefully, will improve our capacity to develop powerful solutions to seemingly intractable problems.
Complementarity in Human-AI Collaboration: Concept, Sources, and Evidence
Hemmer, Patrick, Schemmer, Max, Kühl, Niklas, Vössing, Michael, Satzger, Gerhard
Artificial intelligence (AI) can improve human decision-making in various application areas. Ideally, collaboration between humans and AI should lead to complementary team performance (CTP) -- a level of performance that neither of them can attain individually. So far, however, CTP has rarely been observed, suggesting an insufficient understanding of the complementary constituents in human-AI collaboration that can contribute to CTP in decision-making. This work establishes a holistic theoretical foundation for understanding and developing human-AI complementarity. We conceptualize complementarity by introducing and formalizing the notion of complementarity potential and its realization. Moreover, we identify and outline sources that explain CTP. We illustrate our conceptualization by applying it in two empirical studies exploring two different sources of complementarity potential. In the first study, we focus on information asymmetry as a source and, in a real estate appraisal use case, demonstrate that humans can leverage unique contextual information to achieve CTP. In the second study, we focus on capability asymmetry as an alternative source, demonstrating how heterogeneous capabilities can help achieve CTP. Our work provides researchers with a theoretical foundation of complementarity in human-AI decision-making and demonstrates that leveraging sources of complementarity potential constitutes a viable pathway toward effective human-AI collaboration.
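The defining condition in the abstract above — complementary team performance (CTP) exceeding what either party achieves alone — reduces to a one-line check. A minimal Python sketch; the function name and example accuracies are illustrative, not from the paper:

```python
def is_complementary(human_score, ai_score, team_score):
    """Complementary team performance (CTP): the human-AI team
    strictly outperforms both the human alone and the AI alone."""
    return team_score > max(human_score, ai_score)

# Hypothetical accuracies: the team beats both individuals.
print(is_complementary(0.82, 0.86, 0.91))  # True
# Here the team falls short of the AI alone, so no CTP.
print(is_complementary(0.82, 0.86, 0.84))  # False
```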
ADVISE: AI-accelerated Design of Evidence Synthesis for Global Development
Edwards, Kristen M., Song, Binyang, Porciello, Jaron, Engelbert, Mark, Huang, Carolyn, Ahmed, Faez
When designing evidence-based policies and programs, decision-makers must distill key information from a vast and rapidly growing literature base. Identifying relevant literature from raw search results is time and resource intensive, and is often done by manual screening. In this study, we develop an AI agent based on a bidirectional encoder representations from transformers (BERT) model and incorporate it into a human team designing an evidence synthesis product for global development. We explore the effectiveness of the human-AI hybrid team in accelerating the evidence synthesis process. To further improve team efficiency, we enhance the human-AI hybrid team through active learning (AL). Specifically, we explore different sampling strategies, including random sampling, least confidence (LC) sampling, and highest priority (HP) sampling, to study their influence on the collaborative screening process. Results show that incorporating the BERT-based AI agent into the human team can reduce the human screening effort by 68.5% compared to the case of no AI assistance and by 16.8% compared to the case of using a support vector machine (SVM)-based AI agent for identifying 80% of all relevant documents. When we apply the HP sampling strategy for AL, the human screening effort can be reduced even more: by 78.3% for identifying 80% of all relevant documents compared to no AI assistance. We apply the AL-enhanced human-AI hybrid teaming workflow in the design process of three evidence gap maps (EGMs) for USAID and find it to be highly effective. These findings demonstrate how AI can accelerate the development of evidence synthesis products and promote timely evidence-based decision making in global development in a human-AI hybrid teaming context.
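The least confidence (LC) sampling strategy mentioned above can be sketched briefly: rank unlabeled documents by the classifier's top-class probability and route the least certain ones to the human screeners first. A minimal NumPy illustration; the function name and toy probabilities are assumptions, not the ADVISE implementation:

```python
import numpy as np

def least_confidence_sample(probs, k):
    """Select the k documents whose top-class probability is lowest,
    i.e. those the classifier is least confident about."""
    confidence = probs.max(axis=1)     # top-class probability per document
    return np.argsort(confidence)[:k]  # indices of the k least confident

# Toy relevance probabilities for 4 documents over {irrelevant, relevant}.
probs = np.array([[0.95, 0.05],
                  [0.55, 0.45],   # least confident
                  [0.80, 0.20],
                  [0.60, 0.40]])
print(least_confidence_sample(probs, 2))  # [1 3]
```

In an active-learning loop, the selected documents would be labeled by humans, added to the training set, and the model retrained before the next round of sampling.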
DDoD: Dual Denial of Decision Attacks on Human-AI Teams
Tag, Benjamin, van Berkel, Niels, Verma, Sunny, Zhao, Benjamin Zi Hao, Berkovsky, Shlomo, Kaafar, Dali, Kostakos, Vassilis, Ohrimenko, Olga
Artificial Intelligence (AI) systems have been increasingly used to make decision-making processes faster, more accurate, and more efficient. However, such systems are also at constant risk of being attacked. While the majority of attacks targeting AI-based applications aim to manipulate classifiers or training data and alter the output of an AI model, recently proposed Sponge Attacks against AI models aim to impede the classifier's execution by consuming substantial resources. In this work, we propose \textit{Dual Denial of Decision (DDoD) attacks against collaborative Human-AI teams}. We discuss how such attacks aim to deplete \textit{both computational and human} resources, and significantly impair decision-making capabilities. We describe DDoD on human and computational resources and present potential risk scenarios in a series of exemplary domains.
Chattopadhyay
As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is crucial to benchmark progress in AI, not just in isolation, but also in terms of how it translates to helping humans perform certain tasks, i.e., the performance of human-AI teams. In this work, we design a cooperative game -- GuessWhich -- to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI.
Role of Human-AI Interaction in Selective Prediction
Recent work has shown the potential benefit of selective prediction systems that can learn to defer to a human when the predictions of the AI are unreliable, particularly to improve the reliability of AI systems in high-stakes applications like healthcare or conservation. However, most prior work assumes that human behavior remains unchanged when they solve a prediction task as part of a human-AI team as opposed to by themselves. We show that this is not the case by performing experiments to quantify human-AI interaction in the context of selective prediction. In particular, we study the impact of communicating different types of information to humans about the AI system's decision to defer. Using real-world conservation data and a selective prediction system that improves expected accuracy over that of the human or AI system working individually, we show that this messaging has a significant impact on the accuracy of human judgements.
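The deferral mechanism at the heart of selective prediction can be illustrated with a simple confidence threshold: the AI answers when it is confident and otherwise hands the instance to the human. A minimal Python sketch; the threshold value and function name are assumptions, not the system studied in the paper:

```python
def selective_predict(ai_prob, threshold=0.8):
    """Return the AI's binary label when its confidence is at least
    `threshold`; otherwise defer the decision to the human."""
    label = int(ai_prob >= 0.5)
    confidence = max(ai_prob, 1 - ai_prob)  # certainty of the chosen label
    if confidence < threshold:
        return "defer-to-human"
    return label

print(selective_predict(0.95))  # 1
print(selective_predict(0.60))  # defer-to-human
```

The paper's experiments concern what happens at the deferral boundary: how the system communicates its decision to defer measurably changes the accuracy of the human's subsequent judgment.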